[LLVM][CodeGen][SVE] List MVTs that are desirable for extending loads. #149153
Conversation
Extend AArch64TargetLowering::isVectorLoadExtDesirable to specify the set of MVTs for which load extension is desirable.

NOTE: This PR uncovered two instances where isVectorLoadExtDesirable was called incorrectly, passing in the load node rather than the expected extension.

Fixes #148939
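For readers skimming the diff below, here is a minimal standalone sketch of the predicate this change introduces (plain C++ stand-ins, not the real LLVM MVT/TargetLowering API): only i8/i16/i32/i64 pre-extension scalar types are treated as desirable, so i1 predicate vector loads such as `<vscale x 16 x i1>` are left alone.

```cpp
#include <cassert>

enum class ScalarTy { i1, i8, i16, i32, i64 };

// Stand-in for the new hook logic: extension is desirable only when the
// pre-extension scalar type is i8/i16/i32/i64, so i1 predicate vectors
// (e.g. <vscale x 16 x i1>) are rejected.
static bool isDesirablePreExtScalarTy(ScalarTy Ty) {
  return Ty == ScalarTy::i8 || Ty == ScalarTy::i16 ||
         Ty == ScalarTy::i32 || Ty == ScalarTy::i64;
}

int main() {
  assert(!isDesirablePreExtScalarTy(ScalarTy::i1)); // predicate loads stay as plain loads
  assert(isDesirablePreExtScalarTy(ScalarTy::i8));  // byte loads may fold into extending loads
  return 0;
}
```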
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-aarch64
Author: Paul Walker (paulwalker-arm)
Full diff: https://github.com/llvm/llvm-project/pull/149153.diff (3 files affected)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 0e8e4c9618bb2..e87e25f8f816d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -7633,7 +7633,7 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
if (SDValue(GN0, 0).hasOneUse() &&
isConstantSplatVectorMaskForType(N1.getNode(), ScalarVT) &&
- TLI.isVectorLoadExtDesirable(SDValue(SDValue(GN0, 0)))) {
+ TLI.isVectorLoadExtDesirable(SDValue(SDValue(N, 0)))) {
SDValue Ops[] = {GN0->getChain(), GN0->getPassThru(), GN0->getMask(),
GN0->getBasePtr(), GN0->getIndex(), GN0->getScale()};
@@ -15724,7 +15724,7 @@ SDValue DAGCombiner::visitSIGN_EXTEND_INREG(SDNode *N) {
// fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x)
if (auto *GN0 = dyn_cast<MaskedGatherSDNode>(N0)) {
if (SDValue(GN0, 0).hasOneUse() && ExtVT == GN0->getMemoryVT() &&
- TLI.isVectorLoadExtDesirable(SDValue(SDValue(GN0, 0)))) {
+ TLI.isVectorLoadExtDesirable(SDValue(SDValue(N, 0)))) {
SDValue Ops[] = {GN0->getChain(), GN0->getPassThru(), GN0->getMask(),
GN0->getBasePtr(), GN0->getIndex(), GN0->getScale()};
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 4f13a14d24649..2775bdb175556 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -6439,7 +6439,9 @@ bool AArch64TargetLowering::isVectorLoadExtDesirable(SDValue ExtVal) const {
}
}
- return true;
+ EVT PreExtScalarVT = ExtVal->getOperand(0).getValueType().getScalarType();
+ return PreExtScalarVT == MVT::i8 || PreExtScalarVT == MVT::i16 ||
+ PreExtScalarVT == MVT::i32 || PreExtScalarVT == MVT::i64;
}
unsigned getGatherVecOpcode(bool IsScaled, bool IsSigned, bool NeedsExtend) {
diff --git a/llvm/test/CodeGen/AArch64/sve-intrinsics-ldst-ext.ll b/llvm/test/CodeGen/AArch64/sve-intrinsics-ldst-ext.ll
index 4153f0be611a1..9698f1a6768fd 100644
--- a/llvm/test/CodeGen/AArch64/sve-intrinsics-ldst-ext.ll
+++ b/llvm/test/CodeGen/AArch64/sve-intrinsics-ldst-ext.ll
@@ -231,3 +231,27 @@ define <vscale x 8 x i64> @sload_8i8_8i64(ptr %a) {
%aext = sext <vscale x 8 x i8> %aval to <vscale x 8 x i64>
ret <vscale x 8 x i64> %aext
}
+
+; Ensure we don't try to promote a predicate load to a sign-extended load.
+define <vscale x 16 x i8> @sload_16i1_16i8(ptr %addr) {
+; CHECK-LABEL: sload_16i1_16i8:
+; CHECK: // %bb.0:
+; CHECK-NEXT: ldr p0, [x0]
+; CHECK-NEXT: mov z0.b, p0/z, #-1 // =0xffffffffffffffff
+; CHECK-NEXT: ret
+ %load = load <vscale x 16 x i1>, ptr %addr
+ %zext = sext <vscale x 16 x i1> %load to <vscale x 16 x i8>
+ ret <vscale x 16 x i8> %zext
+}
+
+; Ensure we don't try to promote a predicate load to a zero-extended load.
+define <vscale x 16 x i8> @zload_16i1_16i8(ptr %addr) {
+; CHECK-LABEL: zload_16i1_16i8:
+; CHECK: // %bb.0:
+; CHECK-NEXT: ldr p0, [x0]
+; CHECK-NEXT: mov z0.b, p0/z, #1 // =0x1
+; CHECK-NEXT: ret
+ %load = load <vscale x 16 x i1>, ptr %addr
+ %zext = zext <vscale x 16 x i1> %load to <vscale x 16 x i8>
+ ret <vscale x 16 x i8> %zext
+}
if (SDValue(GN0, 0).hasOneUse() &&
isConstantSplatVectorMaskForType(N1.getNode(), ScalarVT) &&
- TLI.isVectorLoadExtDesirable(SDValue(SDValue(GN0, 0)))) {
+ TLI.isVectorLoadExtDesirable(SDValue(SDValue(N, 0)))) {
I'm surprised that fixing this hasn't affected any existing tests! If I understand correctly, the argument passed to isVectorLoadExtDesirable should be the extended value - do you know why we need to wrap this in two SDValues?
I needed to fix this to keep the existing tests passing. Without this fix, most of the sve-masked-gather-*.ll tests started to fail. I guess the nested SDValue construction was just a typo, so I have removed it.
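To illustrate why the nested wrap is harmless (a toy stand-in, not the real LLVM SDNode/SDValue classes): constructing a value from another value of the same type just invokes the copy constructor, so `SDValue(SDValue(N, 0))` denotes the same value as `SDValue(N, 0)`. The actual bug was which node was passed, the gather GN0 rather than the extension N.

```cpp
#include <cassert>

struct Node {}; // stand-in for SDNode

// Stand-in for SDValue: a (node, result-number) pair.
struct Value {
  Node *N = nullptr;
  unsigned ResNo = 0;
  Value() = default;
  Value(Node *Nd, unsigned R) : N(Nd), ResNo(R) {}
  bool operator==(const Value &O) const { return N == O.N && ResNo == O.ResNo; }
};

int main() {
  Node Ext, Gather;
  // The nested construction only copies; both expressions name the same value.
  assert(Value(Value(&Ext, 0)) == Value(&Ext, 0));
  // The substantive fix: pass the extension node, not the gather it extends.
  assert(!(Value(&Gather, 0) == Value(&Ext, 0)));
  return 0;
}
```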
david-arm left a comment:
LGTM!